Phrase Prioritization Algorithm and Supporting Data Structure for Retrieval

Authors

  • Sachin Kumar
  • Pratishtha Gupta
  • Abraham B. Shani
  • Richard W. Woodman
  • William A. Pasmore
Abstract

At the heart of this research is a proposed algorithm that prioritizes the phrases in a search query. The algorithm extracts phrases from the query and then searches for every possible phrase, so that recall can be increased. The key issue is the choice of a data structure that supports efficient phrase search in documents. For this purpose, a linked representation of a sparse matrix is suggested, consisting of linked lists that run both row-wise and column-wise. Rows correspond to the dictionary of words. Columns correspond to the documents and hence make the search for every possible phrase efficient. The linked representation accommodates the dynamic nature of documents, including the insertion and deletion of words.
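
The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `LinkedSparseMatrix`, `index_document`, and `candidate_phrases` are hypothetical, storing word positions in each node is an assumption made so that phrases can be checked, and ordering candidate phrases longest-first is only one plausible reading of "prioritization".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    word: str
    doc: str
    positions: List[int]             # assumption: word offsets inside the document
    right: Optional["Node"] = None   # next node in the same row (same word)
    down: Optional["Node"] = None    # next node in the same column (same document)


class LinkedSparseMatrix:
    """Rows are dictionary words, columns are documents (hypothetical sketch)."""

    def __init__(self):
        self.row_heads = {}  # word -> first node of its row
        self.col_heads = {}  # doc  -> first node of its column

    def add(self, word, doc, position):
        node = self.row_heads.get(word)
        while node is not None and node.doc != doc:
            node = node.right
        if node is None:  # first occurrence: link into both the row and the column
            node = Node(word, doc, [], self.row_heads.get(word), self.col_heads.get(doc))
            self.row_heads[word] = node
            self.col_heads[doc] = node
        node.positions.append(position)

    @staticmethod
    def _unlink(head, match, link):
        """Return the chain's new head after dropping the first matching node."""
        if head is None or match(head):
            return None if head is None else getattr(head, link)
        prev = head
        while getattr(prev, link) is not None:
            nxt = getattr(prev, link)
            if match(nxt):
                setattr(prev, link, getattr(nxt, link))
                break
            prev = nxt
        return head

    def remove(self, word, doc):
        row = self._unlink(self.row_heads.get(word), lambda n: n.doc == doc, "right")
        col = self._unlink(self.col_heads.get(doc), lambda n: n.word == word, "down")
        for heads, key, new in ((self.row_heads, word, row), (self.col_heads, doc, col)):
            if new is None:
                heads.pop(key, None)
            else:
                heads[key] = new

    def phrase_in_doc(self, phrase, doc):
        wanted = phrase.lower().split()
        positions = {}
        node = self.col_heads.get(doc)
        while node is not None:  # walk down the document's column
            if node.word in wanted:
                positions[node.word] = set(node.positions)
            node = node.down
        if not wanted or len(positions) < len(set(wanted)):
            return False
        return any(all(start + i in positions[w] for i, w in enumerate(wanted))
                   for start in positions[wanted[0]])

    def docs_with_phrase(self, phrase):
        words = phrase.lower().split()
        docs = []
        node = self.row_heads.get(words[0]) if words else None
        while node is not None:  # walk along the first word's row
            if self.phrase_in_doc(phrase, node.doc):
                docs.append(node.doc)
            node = node.right
        return sorted(docs)


def index_document(matrix, doc, text):
    for pos, word in enumerate(text.lower().split()):
        matrix.add(word, doc, pos)


def candidate_phrases(query):
    """All contiguous sub-phrases of the query, longest first (assumed ordering)."""
    words = query.lower().split()
    n = len(words)
    return [" ".join(words[i:i + size])
            for size in range(n, 0, -1)
            for i in range(n - size + 1)]
```

In this sketch a query is expanded with `candidate_phrases`, and each candidate is looked up with `docs_with_phrase`: the row of the phrase's first word yields the candidate documents, and each candidate's column is then scanned to confirm that the words occur at consecutive positions. Because only non-empty cells are stored and linked, adding or deleting a word touches just one row chain and one column chain.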


Similar resources

A rule-based approach for automatically converting dependency parse trees to phrase-structure parse trees for Persian

In this paper, an automatic method for converting a dependency parse tree into an equivalent phrase structure tree is introduced for the Persian language. In the first step, a rule-based algorithm was designed. Then, Persian-specific dependency-to-phrase structure conversion rules were merged into the algorithm. Subsequently, the Persian dependency treebank with about 30,000 sentences was used as an input ...


Phrase Based Document Retrieving by Combining Suffix Tree index data structure and Boyer-Moore faster string searching algorithm

Phrases have been considered a more informative feature term for improving the effectiveness of document retrieval. This paper proposes an algorithm, Phrase Based Document Retrieval, to retrieve similar documents by combining two existing algorithms: the suffix tree index data structure and the Boyer-Moore algorithm, a faster string searching algorithm. The suffix tree is constructed based on E. ...


Generating a Persian phrase-structure treebank by automatic conversion

Treebanks are an important and useful resource in Natural Language Processing tasks. Dependency and phrase structures are two well-known kinds of treebanks. Many efforts have already been made to convert dependency structure to phrase structure. In this paper we study an approach to convert dependency structure to phrase structure because of the lack of a large phrase structure treebank in Persian. A...


Phrase Table Training for Precision and Recall: What Makes a Good Phrase and a Good Phrase Pair?

In this work, the problem of extracting phrase translation is formulated as an information retrieval process implemented with a log-linear model aiming for a balanced precision and recall. We present a generic phrase training algorithm which is parameterized with feature functions and can be optimized jointly with the translation engine to directly maximize the end-to-end system performance. Mu...


Investigating Embedded Question Reuse in Question Answering

The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance by reusing information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...



Publication date: 2015